\begin{aosachapter}{GPSD}{s:gpsd}{Eric Raymond}

%% Note: American spelling. --ARB

GPSD is a suite of tools for managing collections of GPS devices and other
sensors related to navigation and precision timekeeping, including
marine AIS (Automatic Identification System) radios and digital compasses. The main program, a service
daemon named \code{gpsd}, manages a collection of sensors and makes
reports from all of them available as a JSON object stream on a
well-known TCP/IP port.  Other programs in the suite include 
demonstration clients usable as code models and various diagnostic
tools.

GPSD is widely deployed on laptops, smartphones, and autonomous
vehicles including self-driving automobiles and robot submarines. It
features in embedded systems used for navigation, precision
agriculture, location-sensitive scientific telemetry, and network time
service. It's even used in the Identification-Friend-or-Foe system of
armored fighting vehicles including the M1 ``Abrams'' main battle tank.

GPSD is a mid-sized project---about 43 KLOC, mainly in C and
Python---with a history under its current lead going back to 2005 and a
prehistory going back to 1997.  The core team has been stable at about
three developers, with semi-regular contributions from about two dozen
more and the usual one-off patches from hundreds of others.

GPSD has historically had an exceptionally low defect rate, as
measured both by auditing tools such as \code{splint}, \code{valgrind}, and Coverity
and by the incidence of bug reports on its tracker and elsewhere.
This did not come about by accident; the project has been very
aggressive about incorporating technology for automated testing, and
that effort has paid off handsomely.

GPSD is sufficiently good at what it does that it has coopted or
effectively wiped out all of its approximate predecessors and at least
one direct attempt to compete with it.  In 2010, GPSD won the first
Good Code Grant from the Alliance for Code Excellence.  By the time
you finish this chapter you should understand why.

\begin{aosasect1}{Why GPSD Exists}

GPSD exists because the application protocols shipped with GPSs and
other navigation-related sensors are badly designed, poorly
documented, and highly variable by sensor type and model. See
\cite{bib:gps-suck} for a detailed discussion; in particular, you'll 
learn there about the vagaries of NMEA 0183 (the sort-of standard for
GPS reporting packets) and the messy pile of poorly documented vendor
protocols that compete with it. 

If applications had to handle all this complexity themselves the
result would be huge amounts of brittle and duplicative code, leading
to high rates of user-visible defects and constant problems as
hardware gradually mutated out from under the applications.

GPSD isolates location-aware applications from hardware interface
details by knowing about all the protocols itself (at time of writing
we support about 20 different ones), managing serial and USB devices 
so the applications don't have to, and reporting sensor payload
information in a simple device-independent JSON format.  GPSD further
simplifies life by providing client libraries so client applications
need not even know about that reporting format.  Instead, getting
sensor information becomes a simple procedure call.

GPSD also supports precision timekeeping; it can act as a time source
for \code{ntpd} (the Network Time Protocol Daemon) if any of its
attached sensors have PPS (pulse-per-second) capability. The GPSD
developers cooperate closely with the \code{ntpd} project in improving
the network time service.

We are presently (mid-2011) working on completing support for the AIS
network of marine navigational receivers.  In the future, we expect to
support new kinds of location-aware sensors---such as receivers for
second-generation aircraft transponders---as protocol documentation
and test devices become available.

To sum up, the single most important theme in GPSD's design is
hiding all the device-dependent ugliness behind a simple client
interface talking to a zero-configuration service.

\end{aosasect1}

\begin{aosasect1}{The External View}

The main program in the GPSD suite is the \code{gpsd} service daemon.
It can collect the take from a set of attached sensor devices over
RS232, USB, Bluetooth, TCP/IP, and UDP links. Reports are normally
shipped to TCP/IP port 2947, but can also go out via a shared-memory
or D-BUS interface.

The GPSD distribution ships with client libraries for C, C++, and
Python.  It includes sample clients in C, C++, Python, and PHP. A Perl
client binding is available via CPAN.  These client libraries are not
merely a convenience for application developers; they save GPSD's
developers headaches too, by isolating applications from the details
of GPSD's JSON reporting protocol.  Thus, the API exposed to clients
can remain the same even as the protocol grows new features for new
sensor types.

Other programs in the suite include a utility for low-level device
monitoring (\code{gpsmon}), a profiler that produces reports on error
statistics and device timing (\code{gpsprof}), a utility for tweaking
device settings (\code{gpsctl}), and a program for batch-converting
sensor logs into readable JSON (\code{gpsdecode}). Together, they help
technically savvy users look as deeply into the operation of the
attached sensors as they care to.

Of course, these tools also help GPSD's own developers verify the
correct operation of \code{gpsd}. The single most important test tool
is \code{gpsfake}, a test harness for \code{gpsd} which can connect it to any
number of sensor logs as though they were live devices.  With
\code{gpsfake}, we can re-run a sensor log shipped with a bug report
to reproduce specific problems.  \code{gpsfake} is also the engine of
our extensive regression-test suite, which lowers the cost of
modifying the software by making it easy to spot changes that break
things.
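
The replay-and-compare idea behind such a regression suite can be
sketched in a few lines of Python. This is a toy illustration, not
\code{gpsfake} or its interface: \code{decode} is a hypothetical
stand-in for the code under test, and the log and saved output are
invented sample data.

```python
import difflib


def decode(log_lines):
    """Hypothetical stand-in for the code under test: one report per log line."""
    return ["REPORT " + line.strip().upper() for line in log_lines if line.strip()]


def regress(log_lines, expected):
    """Replay a captured log and diff the result against the saved output."""
    actual = decode(log_lines)
    return list(difflib.unified_diff(expected, actual, lineterm=""))


sample_log = ["fix a\n", "fix b\n"]               # invented capture
saved_output = ["REPORT FIX A", "REPORT FIX B"]   # invented known-good result

diff = regress(sample_log, saved_output)
print("PASS" if not diff else "\n".join(diff))
```

An empty diff means the change under test did not alter the reports
produced from that log; anything else points at the lines that moved.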

One of the most important lessons we think we have for future projects
is that it is not enough for a software suite to be correct; it should
also be able to \emph{demonstrate its own correctness}.  We have found that
when this goal is pursued properly it is not a hair shirt but rather a
pair of wings---the time we've taken to write test harnesses and
regression tests has paid for itself many times over in the freedom
it gives us to modify code without fearing that we are wreaking
subtle havoc on existing functionality.

\end{aosasect1}

\begin{aosasect1}{The Software Layers}

There is a lot more going on inside GPSD than the ``plug a sensor in
and it just works'' experience might lead people to assume.
\code{gpsd}'s internals break naturally into four pieces: the
\emph{drivers}, the \emph{packet sniffer}, the \emph{core library} and
the \emph{multiplexer}. We'll describe these from the bottom up.

% Insert the software-layers diagram here
\aosafigure[300pt]{../images/gpsd/software-layers.pdf}{Software layers}{fig.gpsd.layers}

The \emph{drivers} are essentially user-space device drivers for each
kind of sensor chipset we support.  The key entry points are methods
to parse a data packet into time-position-velocity or status
information, change the device's mode or baud rate, probe for device subtype,
etc.  Auxiliary methods may support driver control operations, such as
changing the serial speed of the device. The entire interface to a
driver is a C structure full of data and method pointers, deliberately
modeled on a Unix device driver structure.
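
As a rough analogue of that structure, here is a minimal Python
sketch of a record holding data and method pointers. The daemon
itself is C, and every name here (\code{Driver},
\code{parse\_packet}, \code{set\_speed}) is hypothetical rather than
taken from the GPSD source.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Driver:
    """Hypothetical analogue of a driver structure: data plus method pointers."""
    name: str
    parse_packet: Callable[[bytes, dict], int]               # fills the session
    set_speed: Optional[Callable[[dict, int], bool]] = None  # optional control method


def toy_parse(payload, session):
    """Store the raw payload and report that something was parsed."""
    session["raw"] = payload
    return 1


def toy_set_speed(session, baud):
    """Pretend to change the serial speed of the device."""
    session["baud"] = baud
    return True


TOY_DRIVER = Driver("toy", toy_parse, toy_set_speed)
READONLY_DRIVER = Driver("readonly", toy_parse)   # no control methods at all


def change_speed(driver, session, baud):
    """Callers go through the table entry and cope with a missing method."""
    if driver.set_speed is None:
        return False
    return driver.set_speed(session, baud)
```

Because callers only ever touch the fields of the record, a driver
that lacks an auxiliary method simply leaves that slot empty.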

The \emph{packet sniffer} is responsible for mining data packets out
of serial input streams.  It's basically a state machine that watches
for anything that looks like one of our 20 or so known packet types
(most of which are checksummed, so we can have high confidence when we
think we have identified one).  Because devices can hotplug or change
modes, the type of packet that will come up the wire from a serial or
USB port isn't necessarily fixed forever by the first one recognized.
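
To give a feel for the technique, the following Python sketch
recognizes a single packet type: NMEA-style sentences whose checksum
is the XOR of the bytes between the leading \code{\$} and the
\code{*}. It is far simpler than the real sniffer; the class name,
the states, and the length bound are all assumptions made for this
example.

```python
from functools import reduce

HEX_DIGITS = b"0123456789ABCDEFabcdef"
MAX_BODY = 80  # toy bound on sentence length, chosen for this sketch


def nmea_checksum(body: bytes) -> int:
    """XOR together every byte between the '$' and the '*'."""
    return reduce(lambda acc, byte: acc ^ byte, body, 0)


class ToySniffer:
    """Three-state recognizer for '$<body>*<two hex digits>' packets."""

    GROUND, BODY, CHECKSUM = range(3)

    def __init__(self):
        self.state = self.GROUND
        self.body = bytearray()
        self.digits = bytearray()

    def feed(self, data: bytes):
        """Consume a chunk of input; return the bodies of validated packets."""
        packets = []
        for byte in data:
            if self.state == self.GROUND:
                if byte == ord("$"):
                    self._start()
            elif self.state == self.BODY:
                if byte == ord("$"):
                    self._start()               # resynchronize on a new lead-in
                elif byte == ord("*"):
                    self.state = self.CHECKSUM
                elif byte in b"\r\n" or len(self.body) >= MAX_BODY:
                    self.state = self.GROUND    # malformed: drop it, keep hunting
                else:
                    self.body.append(byte)
            else:                               # CHECKSUM: collect two hex digits
                self.digits.append(byte)
                if len(self.digits) == 2:
                    if self._checksum_ok():
                        packets.append(bytes(self.body))
                    self.state = self.GROUND
        return packets

    def _start(self):
        self.body.clear()
        self.digits.clear()
        self.state = self.BODY

    def _checksum_ok(self):
        if any(d not in HEX_DIGITS for d in self.digits):
            return False
        return int(self.digits.decode("ascii"), 16) == nmea_checksum(self.body)


body = b"GPGLL,4005.00,N,07532.00,W"           # invented sample sentence
good = b"$" + body + b"*%02X\r\n" % nmea_checksum(body)
sniffer = ToySniffer()
# The packet arrives split across two reads, preceded by line noise.
found = sniffer.feed(b"\x00noise" + good[:10]) + sniffer.feed(good[10:])
```

The state survives between calls to \code{feed}, so a packet split
across reads is still recognized, and a failed checksum just sends
the machine back to hunting.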

The \emph{core library} manages a session with a sensor device.  The
key entry points are:

\begin{aosaitemize}

  \item starting a session by opening the device and reading data from
    it, hunting through baud rates and parity/stopbit combinations
    until the packet sniffer achieves synchronization lock with a
    known packet type;

  \item polling the device for a packet; and

  \item closing the device and wrapping up the session.

\end{aosaitemize}

A key feature of the core library is that it is responsible for
switching each GPS connection to using the correct device driver
depending on the packet type that the sniffer returns.  This is
\emph{not configured in advance} and may change over time, notably if
the device switches between different reporting protocols.  (Most GPS
chipsets support NMEA and one or more vendor binary protocols, and
devices like AIS receivers may report packets in two different
protocols on the same wire.)
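
A minimal sketch of that switching logic, assuming a simple mapping
from packet type to driver, might look like this in Python. The class
and the packet-type labels are hypothetical; the real core library is
C and tracks far more state.

```python
class ToySession:
    """Hypothetical per-device session that rebinds its driver on the fly."""

    def __init__(self, drivers):
        self.drivers = drivers   # packet type -> driver (names stand in here)
        self.driver = None       # nothing is configured in advance
        self.switches = 0

    def on_packet(self, packet_type):
        """Select the driver matching the sniffer's verdict for this packet."""
        wanted = self.drivers.get(packet_type)
        if wanted is not None and wanted != self.driver:
            self.driver = wanted
            self.switches += 1
        return self.driver


session = ToySession({"NMEA": "nmea-driver", "BINARY": "vendor-driver"})
# The device starts in NMEA, then switches to a vendor binary protocol.
seen = [session.on_packet(t) for t in ("NMEA", "NMEA", "BINARY", "UNKNOWN")]
```

Here an unrecognized packet type leaves the current driver in place,
which is one plausible policy rather than a description of what
\code{gpsd} does.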

Finally, the \emph{multiplexer} is the part of the daemon that handles
client sessions and device assignment.  It is responsible for passing
reports up to clients, accepting client commands, and responding to
hotplug notifications. It is essentially all contained in one source
file, \code{gpsd.c}, and never talks to the device drivers directly.

The first three components (other than the multiplexer) are linked
together in a library called \code{libgpsd} and can be used separately
from the multiplexer. Our other tools that talk to sensors directly,
such as \code{gpsmon} and \code{gpsctl}, do it by calling into the
core library and driver layer directly.

The most complex single component is the packet sniffer at about two
thousand lines of code.  This is irreducible; a state machine
that can recognize as many different protocols as it does is bound to
be large and gnarly.  Fortunately, the packet sniffer is also easy to
isolate and test; problems in it do not tend to be coupled to other
parts of the code.

The multiplexer layer is about the same size, but somewhat less gnarly.
The device drivers make up the bulk of the daemon code at around 15
KLOC.  All the rest of the code---all the support tools and libraries
and test clients together---adds up to about the size of the daemon
(some code, notably the JSON parser, is shared between the daemon and
the client libraries).

The success of this layering approach is demonstrated in a couple of
different ways.  One is that new device drivers are so easy to write
that several have been contributed by people not on the core team: the
driver API is documented, and the individual drivers are coupled to
the core library only via pointers in a master device types table.

Another benefit is that system integrators can drastically reduce
GPSD's footprint for embedded deployment simply by electing not to
compile in unused drivers.  The daemon is not large to begin with, and
a suitably stripped-down build runs quite happily on low-power,
low-speed, small-memory ARM devices\footnote{ARM is a 32-bit RISC instruction set architecture used in mobile and embedded electronics. See \url{http://en.wikipedia.org/wiki/ARM_architecture}.}.

A third benefit of the layering is that the daemon multiplexer can be
detached from atop the core library and replaced with simpler logic,
such as the straight batch conversion of sensor logfiles to JSON
reports that the \code{gpsdecode} utility does.

There is nothing novel about this part of the GPSD architecture. Its
lesson is that conscious and rigorous application of the design
pattern of Unix device handling is beneficial not just in OS kernels
but also in userspace programs that are similarly required to deal
with varied hardware and protocols.

\end{aosasect1}

\begin{aosasect1}{The Dataflow View}

Now we'll consider GPSD's architecture from a dataflow view.  In
normal operation, \code{gpsd} spins in a loop waiting for input from one of
these sources:

\begin{aosaenumerate}

  \item A set of clients making requests over a TCP/IP port.

  \item A set of navigation sensors connected via serial or USB
    devices.

  \item The special control socket used by hotplug scripts and some
    configuration tools.

  \item A set of servers issuing periodic differential-GPS correction
    updates (DGPS and NTRIP).  These are handled as though they are
    navigation sensors.

\end{aosaenumerate}

When a USB port goes active with a device that might be a navigation
sensor, a hotplug script (shipped with GPSD) sends a notification to
the control socket.  This is the cue for the multiplexer layer to put
the device on its internal list of sensors.  Conversely, a
device-removal event can remove a device from that list.

When a client issues a watch request, the multiplexer layer opens the
navigation sensors in its list and begins accepting data from them (by
adding their file descriptors to the set in the main select
call). Otherwise all GPS devices are closed (but remain in the list)
and the daemon is quiescent. Devices that stop sending data get timed
out of the device list.
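
The pattern of watching only the currently open descriptors can be
sketched with Python's \code{select} module. In this toy version,
local socket pairs stand in for sensor devices, and the helper name
\code{poll\_once} is an assumption of the example; the daemon's main
loop is written in C.

```python
import select
import socket


def poll_once(watched, timeout=1.0):
    """Return the sockets in `watched` that have data ready to read."""
    if not watched:
        return []          # nothing is open, so the toy daemon stays quiescent
    readable, _, _ = select.select(list(watched), [], [], timeout)
    return readable


# Two local socket pairs stand in for sensor devices.
dev_a, feed_a = socket.socketpair()
dev_b, feed_b = socket.socketpair()

watched = set()                      # no client is watching yet
idle = poll_once(watched)

watched.update({dev_a, dev_b})       # a client issues a watch request
feed_b.sendall(b"$GPGLL")            # only device B has something to say
ready = poll_once(watched)
ready_is_b = ready == [dev_b]
data = ready[0].recv(64) if ready else b""

for sock in (dev_a, feed_a, dev_b, feed_b):
    sock.close()
```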

% Insert the dataflow diagram here
\aosafigure{../images/gpsd/dataflow.pdf}{Dataflow}{fig.gpsd.dataflow}

When data comes in from a navigation sensor, it's fed to the packet
sniffer, a finite-state machine that works like the lexical analyzer
in a compiler.  The packet sniffer's job is to accumulate data from
each port (separately), recognizing when it has accumulated a packet
of a known type.

A packet may contain a position fix from a GPS, a marine AIS datagram,
a sensor reading from a magnetic compass, a DGPS (Differential GPS)
broadcast packet, or any
of several other things.  The packet sniffer doesn't care about the 
content of the packet; all it does is tell the core library when it
has accumulated one and pass back the payload and the packet type.

The core library then hands the packet to the driver associated with
its type.  The driver's job is to mine data out of the packet payload
into a per-device session structure and set some status bits telling
the multiplexer layer what kind of data it got.

One of those bits is an indication that the daemon has accumulated
enough data to ship a report to its clients.  When this bit is raised
after a data read from a sensor device, it means we've seen the end of
a packet and the end of a packet group (which may be one or more
packets), so the data in the device's session structure should be
passed to one of the exporters.
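
The status-bit convention can be illustrated with a small Python
sketch. The flag names, the two-packet group, and the toy packet
syntax are all invented for this example; the daemon's actual bits
and packet formats differ.

```python
from enum import IntFlag


class Status(IntFlag):
    """Hypothetical status bits returned by a driver."""
    NONE = 0
    LATLON = 1
    ALTITUDE = 2
    REPORT = 4      # enough data has accumulated to ship a report


def toy_driver(packet, session):
    """Mine one packet into the session and say what changed."""
    kind, *fields = packet.split()
    if kind == "POS":
        session["lat"], session["lon"] = map(float, fields)
        return Status.LATLON
    if kind == "ALT":                 # assumed to end the packet group
        session["alt"] = float(fields[0])
        return Status.ALTITUDE | Status.REPORT
    return Status.NONE


def run(packets):
    """Feed packets to the driver; export a snapshot whenever REPORT is set."""
    session, shipped = {}, []
    for packet in packets:
        if toy_driver(packet, session) & Status.REPORT:
            shipped.append(dict(session))
    return shipped


reports = run(["POS 40.05 -75.32", "ALT 120.0", "POS 40.06 -75.33"])
```

The third packet updates the session but ships nothing, because its
group has not ended yet.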

The main exporter is the ``socket'' one; it generates a report object
in JSON and ships it to all the clients watching the device. There's a
shared-memory exporter that copies the data to a shared-memory segment
instead. In either of these cases, it is expected that a client
library will unmarshal the data into a structure in the client
program's memory space.  A third exporter, which ships position
updates via D-BUS, is also available.

The GPSD code is as carefully partitioned horizontally as it is
vertically.  The packet sniffer neither knows nor needs to know
anything about packet payloads, and doesn't care whether its input
source is a USB port, an RS232 device, a Bluetooth radio link, a
pseudo-tty, a TCP socket connection, or a UDP packet stream.  The
drivers know how to analyze packet payloads, but know nothing about
either the packet-sniffer internals or the exporters.  The exporters
look only at the session data structure updated by the drivers.

This separation of function has served GPSD very well. For example,
when we got a request in early 2010 to adapt the code to accept sensor
data coming in as UDP packets for the on-board navigation system of a
robot submarine, it was easy to implement that in a handful of lines
of code without disturbing later stages in the data pipeline.

More generally, careful layering and modularization have made it
relatively easy to add new sensor types.  We incorporate new drivers
every six months or so; some have been written by people who are not
core developers.

\end{aosasect1}

\begin{aosasect1}{Defending the Architecture}

As an open source program like \code{gpsd} evolves, one of the
recurring themes is that each contributor will do things to solve his or
her particular problem case which gradually leak more information
between layers or stages that were originally designed with clean
separation.

One that we're concerned about at the time of writing is that some
information about input source type (USB, RS232, pty, Bluetooth, TCP,
UDP) seems to need to be passed up to the multiplexer layer, to tell it,
for example, whether probe strings should be sent to an unidentified
device. Such probes are sometimes required to wake up RS232C sensors, but
there are good reasons not to ship them to any more devices than we
have to. Many GPSs and other sensor devices are designed on low
budgets and in a hurry; some can be confused to the point of catatonia
by unexpected control strings.

For a similar reason, the daemon has a \code{-b} option that prevents
it from attempting baud-rate changes during the packet-sniffer
hunt loop.  Some poorly made Bluetooth devices handle these so badly
that they have to be power-cycled to function again; in one
extreme case a user actually had to unsolder the backup battery to
unwedge his!

Both these cases are necessary exceptions to the project's design
rules.  Much more usually, though, such exceptions are a bad thing.
For example, we've had some patches contributed to make PPS time
service work better that messed up the vertical layering, making it
impossible for PPS to work properly with more than the one driver they
were intended to help. We rejected these in favor of working harder
at device-type-independent improvement.

On one occasion some years ago, we had a request to support a GPS with
the odd property that the checksums in its NMEA packets may be invalid
when the device doesn't have a location fix.  To support this device,
we would have had to either (a) give up on validating the checksum on
\emph{any} incoming data that looked like an NMEA packet, risking that
the packet-sniffer would hand garbage to the NMEA driver, or (b) add a
command-line option to force the sensor type.

The project lead (the author of this chapter) refused to do either.
Giving up on NMEA packet validation was an obvious bad idea.  But a
switch to force the sensor type would have been an invitation to get
lazy about proper autoconfiguration, which would cause problems all
the way up to GPSD's client applications and their users.  The next
step down that road paved with good intentions would surely have been
a baud-rate switch. Instead, we declined to support this broken device.

One of the most important duties of a project's lead architect is to
defend the architecture against expedient ``fixes'' that would break
it and cause functional problems or severe maintenance headaches
down the road.  Arguments over this can get quite heated,
especially when defending architecture conflicts with something
that a developer or user considers a must-have feature.  But these
arguments are necessary, because the easiest choice is often the wrong
one for the longer term.

\end{aosasect1}

\begin{aosasect1}{Zero Configuration, Zero Hassles}

An extremely important feature of \code{gpsd} is that it is a
zero-configuration service\footnote{With one minor exception for
Bluetooth devices with broken firmware.}.  It has no dotfile!  The
daemon deduces the sensor types it's talking to by sniffing the
incoming data.  For RS232 and USB devices \code{gpsd} even autobauds
(that is, automatically detects the serial line speed), so it is not
necessary for the daemon to know in advance the speed/parity/stopbits
at which the sensor is shipping information.
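
The hunt itself can be pictured as a loop over candidate line
settings that stops as soon as the input starts to look like a known
packet. The Python sketch below runs against a simulated port; the
candidate lists, the \code{FakePort} class, and the crude recognition
test are assumptions of the example, not the daemon's actual tables
or logic.

```python
BAUD_RATES = (4800, 9600, 19200, 38400, 57600, 115200)  # assumed candidates
FRAMINGS = ("8N1", "7E1", "7O1")                        # assumed candidates


class FakePort:
    """Simulated serial device that yields clean data at only one setting."""

    def __init__(self, baud, framing):
        self._true_setting = (baud, framing)
        self.current = None

    def configure(self, baud, framing):
        self.current = (baud, framing)

    def read(self):
        if self.current == self._true_setting:
            return b"$GPGLL,sample"
        return b"\xff\x00\xfe"      # garbage, as seen at the wrong line speed


def looks_like_packet(data):
    """Crude stand-in for the packet sniffer achieving synchronization."""
    return data.startswith(b"$")


def hunt(port):
    """Try each setting in turn; return the first one that yields a packet."""
    for baud in BAUD_RATES:
        for framing in FRAMINGS:
            port.configure(baud, framing)
            if looks_like_packet(port.read()):
                return baud, framing
    return None


found_setting = hunt(FakePort(38400, "7E1"))
```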

When the host operating system has a hotplug capability, hotplug
scripts can ship device-activation and deactivation messages to a
control socket to notify the daemon of the change in its environment.
The GPSD distribution supplies these scripts for Linux.  The result
is that end users can plug a USB GPS into their laptop and expect
it to immediately begin supplying reports that location-aware
applications can read---no muss, no fuss, and no editing a 
dotfile or preferences registry.

The benefits of this ripple all the way up the application stack.
Among other things, it means that location-aware applications don't
have to have a configuration panel dedicated to tweaking the GPS and
port settings until the whole mess works. This saves a lot of effort
for application writers as well as users: they get to treat location
as a service that is nearly as simple as the system clock.

One consequence of the zero-configuration philosophy is that we do not
look favorably on proposals to add a config file or additional
command-line options.  The trouble with such proposals is that configuration
which can be edited \emph{must} be edited.  This implies adding setup
hassle for end users, which is precisely what a well-designed service
daemon should avoid.

The GPSD developers are Unix hackers working from deep inside the
Unix tradition, in which configurability and having lots of knobs
is close to being a religion.  Nevertheless, we think open source
projects could be trying a lot harder to throw away their dotfiles
and autoconfigure to what the running environment is actually doing.

\end{aosasect1}

\begin{aosasect1}{Embedded Constraints Considered Helpful}

Designing for embedded deployment has been a major goal of GPSD since
2005. This was originally because we got a lot of interest from system
integrators working with single-board computers, but it has since paid
off in an unexpected way: deployment on GPS-enabled smartphones. (Our
very favorite embedded-deployment reports are still the ones from the
robot submarines, though.)

Designing for embedded deployment has influenced GPSD in important
ways.  We think a lot about ways to keep memory footprint and CPU
usage low so the code will run well on low-speed, small-memory,
power-constrained systems.

One important attack on this issue, as previously mentioned, is to
ensure that \code{gpsd} builds don't have to carry any deadweight over
the specific set of sensor protocols that a system integrator needs to
support. As of June 2011, a minimum static build of \code{gpsd} has a
memory footprint of about 69K (that is \emph{with} all required
standard C libraries linked in) on 64-bit x86. For comparison, the
static build with all drivers is about 418K.

Another is that we profile for CPU hotspots with a slightly different
emphasis than most projects. Because location sensors tend to report
only small amounts of data at intervals on the order of 1 second,
performance in the normal sense isn't a GPSD issue---even grossly
inefficient code would be unlikely to introduce enough latency to be
visible at the application level.  Instead, our focus is on decreasing
processor usage and power consumption.  We've been quite successful at
this: even on low-power ARM systems without an FPU, \code{gpsd}'s
fraction of CPU is down around the level of profiler noise.

While designing the core code for low footprint and good power
efficiency is at this point largely a solved problem, there is one
respect in which targeting embedded deployments still produces tension
in the GPSD architecture: use of scripting languages. On the one hand,
we want to minimize defects due to low-level resource management by
moving as much code as possible out of C.  On the other hand, Python
(our preferred scripting language) is simply too heavyweight and slow
for most embedded deployments.

We've split the difference in the obvious way: the \code{gpsd} service
daemon is C, while the test framework and several of the support
utilities are written in Python. Over time, we hope to migrate more of
the auxiliary code out of C and into Python, but embedded deployment
makes those choices a continuing source of controversy and discomfort.

Still, on the whole we find the pressures from embedded deployment
quite bracing.  It feels good to write code that is lean, tight, and
sparing of processor resources.  It has been said that art comes from
creativity under constraints; to the extent that's true, GPSD is
better art for the pressure.

That feeling doesn't translate directly into advice for other
projects, but something else definitely does: don't guess, measure!
There is nothing like regular profiling and footprint measurements to
warn you when you're straying into committing bloat---and to reassure
you that you're not.

\end{aosasect1}

\begin{aosasect1}{JSON and the Architecturenauts}

One of the most significant transitions in the history of the project
was when we switched over from the original reporting protocol to
using JSON as a metaprotocol and passing reports up to clients as JSON
objects. The original protocol had used one-letter keys for commands
and responses, and we literally ran out of keyspace as the daemon's
capabilities gradually increased.

Switching to JSON was a big, big win. JSON combines the traditional
Unix virtues of a purely textual format---easy to examine with a Mark
1 Eyeball, easy to edit with standard tools, easy to generate
programmatically---with the ability to pass structured information in
rich and flexible ways.

By mapping report types to JSON objects, we ensured that any report
could contain mixes of string, numeric, and Boolean data with
structure (a capability the old protocol lacked).  By identifying
report types with a \code{"class"} attribute, we guaranteed that we
would always be able to add new report types without stepping on old
ones.
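
On the client side, this means a reader can dispatch on the
\code{"class"} attribute and skip anything it does not recognize.
The Python sketch below shows one way to do that; the handler
registry is an invention of this example, and the sample report and
its field names should be read as illustrative rather than as a
specification of the protocol.

```python
import json

HANDLERS = {}   # report class -> handler function


def handles(report_class):
    """Decorator that registers a handler for one report class."""
    def register(func):
        HANDLERS[report_class] = func
        return func
    return register


@handles("TPV")
def on_tpv(report):
    """Handle an illustrative time-position-velocity report."""
    return ("fix", report.get("lat"), report.get("lon"))


def dispatch(line):
    """Decode one JSON report and route it by its "class" attribute."""
    report = json.loads(line)
    handler = HANDLERS.get(report.get("class"))
    if handler is None:
        return None         # unknown class: ignore it, old clients keep working
    return handler(report)


known = dispatch('{"class": "TPV", "lat": 40.05, "lon": -75.32}')
unknown = dispatch('{"class": "FUTURE", "x": 1}')
```

A report class added later falls through to the \code{None} branch,
so existing clients are not disturbed by it.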

This decision was not without cost.  A JSON parser is a bit more
computationally expensive than the very simple and limited parser it
replaced, and certainly requires more lines of code (implying more
places for defects to occur). Also, conventional JSON parsers require
dynamic storage allocation in order to cope with the variable-length
arrays and dictionaries that JSON describes, and dynamic storage
allocation is a notorious defect attractor.

We coped with these problems in several ways. The first step was to
write a C parser for a (sufficiently) large subset of JSON that uses
entirely static storage.  This required accepting some minor
restrictions; for example, objects in our dialect cannot contain the
JSON \code{null} value, and arrays always have a fixed maximum length.
Accepting these restrictions allowed us to fit the parser into 600
lines of C.
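
Python cannot express static storage directly, so the following
sketch only illustrates the shape of those two restrictions: the
caller supplies a fixed-size buffer, and input that contains
\code{null} or overflows the buffer is rejected. The names and the
bound of four elements are assumptions of the example, and it leans
on Python's own (allocating) JSON parser rather than reproducing the
C one.

```python
import json

MAX_ITEMS = 4                # assumed fixed maximum array length
slots = [0.0] * MAX_ITEMS    # storage set aside once, reused for every parse


def parse_into(text, out):
    """Fill `out` from a JSON array of numbers; return how many were stored."""
    values = json.loads(text)   # allocates; a C version would scan in place
    if not isinstance(values, list):
        raise ValueError("expected an array")
    if len(values) > len(out):
        raise ValueError("array exceeds fixed maximum length")
    for i, value in enumerate(values):
        if value is None:
            raise ValueError("null is not part of this dialect")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("expected a number")
        out[i] = float(value)
    return len(values)


def rejects(text):
    """True when the bounded parser refuses the input."""
    try:
        parse_into(text, [0.0] * MAX_ITEMS)
    except ValueError:
        return True
    return False


count = parse_into("[1, 2.5, 3]", slots)
```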

We then built a comprehensive set of unit tests for the parser 
in order to verify error-free operation. Finally, for very tight
embedded deployments where the overhead of JSON might be too high,
we wrote a shared-memory exporter that bypasses the need to
ship and parse JSON entirely if the daemon and its client have
access to common memory.

JSON isn't just for web applications anymore.  We think anyone
designing an application protocol should consider an approach like
GPSD's.  Of course the idea of building your protocol on top of a
standard metaprotocol is not new; XML fans have been pushing it for
many years, and that makes sense for protocols with a document-like
structure. JSON has the advantages of being lower-overhead than XML
and better fitted to passing around array and record structures.

\end{aosasect1}

\begin{aosasect1}{Designing for Zero Defects}

Because of its use in navigational systems, any software that lives
between the user and a GPS or other location sensor is potentially
life-critical, especially at sea or when airborne.  Open source
navigation software has a tendency to try to evade this problem by
shipping with disclaimers that say, ``Don't rely on this if doing so
might put lives at risk.''

We think such disclaimers are futile and dangerous: futile because
system integrators are quite likely to treat them as pro-forma and
ignore them, and dangerous because they encourage developers to fool
themselves that code defects won't have serious consequences, and that
cutting corners in quality assurance is acceptable.

The GPSD project developers believe that the only acceptable policy is
to design for zero defects. Software complexity being what it is, we
have not quite achieved this---but for a project of GPSD's size and age
and complexity we come very close.

Our strategy for doing this is a combination of architecture and
coding policies that aim to \emph{exclude the possibility of defects in
shipped code}.

One important policy is this: the \code{gpsd} daemon never uses
dynamic storage allocation---no \code{malloc} or \code{calloc}, and no
calls to any functions or libraries that require it.  At a stroke
this banishes the single most notorious defect attractor in C coding.
We have no memory leaks and no double-malloc or double-free bugs, and
we never will.

We get away with this because all of the sensors we handle emit
packets with relatively small fixed maximum lengths, and the daemon's
job is to digest them and ship them to clients with minimal buffering.
Still, banishing \code{malloc} requires coding discipline and some
design compromises, a few of which we previously noted in discussing
the JSON parser. We pay these costs willingly to reduce our defect
rate.

A useful side effect of this policy is that it increases the
effectiveness of static code checkers such as \code{splint},
\code{cppcheck}, and Coverity.  This feeds into another major policy
choice; we make extremely heavy use of both these code-auditing tools
and a custom framework for regression testing.  (We do not know of any
program suite larger than GPSD that is fully \code{splint}-annotated,
and strongly suspect that none such yet exist.)

The highly modular architecture of GPSD aids us here as well. The
module boundaries serve as cut points where we can rig test harnesses,
and we have very systematically done so.  Our normal regression test
checks everything from the floating-point behavior of the host
hardware up through JSON parsing to correct reporting behavior on over
seventy different sensor logs.

Admittedly, we have a slightly easier time being rigorous than many
applications would because the daemon has no user-facing interfaces;
the environment around it is just a bunch of serial data streams and is
relatively easy to simulate.  Still, as with banishing \code{malloc},
actually exploiting that advantage requires the right attitude, which
very specifically means being willing to spend as much design and
coding time on test tools and harnesses as we do on the production
code.  This is a policy we think other open-source projects can and
should emulate.

As I write (July 2011), GPSD's project bug tracker is empty.  It has
been empty for weeks, and based on past rates of bug submissions we can
expect it to stay that way for a good many more.  We haven't shipped
code with a crash bug in six years.  When we do have bugs, they tend
to be the sort of minor missing feature or mismatch with specification
that is readily fixed in a few minutes of work.

This is not to say that the project has been an uninterrupted idyll.  
Next, we'll review some of our mistakes{\ldots}

\end{aosasect1}

\begin{aosasect1}{Lessons Learned}

Software design is difficult; mistakes and blind alleys are all too
normal a part of it, and GPSD has been no exception to that rule.  The
largest mistake in this project's history was the design of the
original pre-JSON protocol for requesting and reporting GPS
information.  Recovering from it took years of effort, and there are
lessons in both the original mis-design and the recovery.

There were two serious problems with the original protocol:

\begin{aosaenumerate}

  \item Poor extensibility.  It used request and response tags
    consisting of a single letter each, case-insensitive. Thus, for
    example, the request to report longitude and latitude was
    \code{"P"} and a response looked like \code{"P\,-75.32\,40.05"}. 
    Furthermore, the parser interpreted a request like
    \code{"PA"} as a \code{"P"} request followed by an \code{"A"}
    (altitude) request.  As the daemon's capabilities gradually
    broadened, we literally ran out of command space.

  \item A mismatch between the protocol's implicit model of sensor
    behavior and how sensors actually behave.  The old protocol was
    request/response: send a request for position (or altitude, or
    whatever), then get back a report sometime later. In reality, it is
    usually not possible to request a report from a GPS or other
    navigation-related sensor; such devices stream out reports, and the best
    a request can do is query a cache.  This mismatch encouraged
    sloppy data-handling from applications; too often, they would ask
    for location data without also requesting a timestamp or any check
    information about the fix quality, a practice which could easily
    result in stale or invalid data getting presented to the user.

\end{aosaenumerate}
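
The first problem can be made concrete with a minimal sketch, assuming
a toy parser rather than the daemon's actual code.  Only the \code{"P"}
and \code{"A"} commands and the shape of the \code{"P"} reply come from
the description above; the names \code{HANDLERS},
\code{parse\_requests}, and \code{respond}, the \code{"A"} reply
format, and the cached values are all invented for illustration.

```python
import string

# Hypothetical command table: only "P" and "A" are named in the text;
# the reply formats and everything else here are illustrative assumptions.
HANDLERS = {
    "P": lambda cache: "P {lon} {lat}".format(**cache),
    "A": lambda cache: "A {alt}".format(**cache),
}


def parse_requests(line):
    """Split a request string into one command per letter, ignoring case."""
    return [ch.upper() for ch in line if ch.isalpha()]


def respond(line, cache):
    """Answer each recognized single-letter command from cached data."""
    return [HANDLERS[cmd](cache) for cmd in parse_requests(line) if cmd in HANDLERS]


# Made-up cached fix; a request can only query this cache, not the sensor.
cache = {"lon": -75.32, "lat": 40.05, "alt": 120.0}

# "pa" is read as a "P" request followed by an "A" request.
replies = respond("pa", cache)

# With case-insensitive single letters, the whole command space is this small.
command_space = len(set(string.ascii_uppercase))
```

Because every letter is already a complete command, a multi-letter name
can never be introduced without changing the meaning of existing
requests, which is how a design like this runs out of room.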

It became clear as early as 2006 that the old protocol design was
inadequate, but it took nearly three years of design sketches and
false starts to settle on a new one.  The transition took two years
after that, and caused some pain for developers of client applications.
It would have cost a lot more if the project had not shipped client-side
libraries that insulated users from most of the protocol details---but
we didn't get the API of those libraries quite right either at first.

If we had known then what we know now, the JSON-based protocol would
have been introduced five years sooner, and the API design of the
client libraries would have required many fewer revisions. But there
are some kinds of lessons only experience and experiment can teach.

There are at least two design guidelines that future service daemons
could bear in mind to avoid replicating our mistakes:

\begin{aosaenumerate}

  \item Design for extensibility.  If your daemon's application
    protocol can run out of namespace the way our old one did, you're
    doing it wrong. Overestimating the short-term costs and
    underestimating the long-term benefits of metaprotocols like XML
    and JSON is an error that's all too common.

  \item Client-side libraries are a better idea than exposing the
    application protocol details. A library may be able to adapt its
    internals to multiple versions of the application protocol,
    substantially reducing both interface complexity and defect rates
    compared to the alternative, in which each application writer needs to
    develop an ad hoc binding.  This difference will translate
    directly into fewer bug reports on your project's tracker.

\end{aosaenumerate}
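
Both guidelines can be illustrated together with a short sketch,
assuming a hypothetical helper named \code{parse\_position}; it is not
the API of GPSD's client libraries.  The old-style line follows the
\code{"P"} reply described earlier, and the JSON field names
(\code{class}, \code{lon}, \code{lat}, \code{mode}) are assumptions
made for the example.

```python
import json


def parse_position(line):
    """Return {"lon": ..., "lat": ...} from either wire format, or None.

    Hypothetical helper: the caller never sees which protocol version
    produced the line.
    """
    line = line.strip()
    if line.startswith("{"):
        # JSON-style report: pick out the keys we need, ignore the rest.
        report = json.loads(line)
        if "lon" in report and "lat" in report:
            return {"lon": report["lon"], "lat": report["lat"]}
        return None
    # Old-style report: a single-letter tag followed by two numbers.
    fields = line.split()
    if len(fields) == 3 and fields[0].upper() == "P":
        return {"lon": float(fields[1]), "lat": float(fields[2])}
    return None


old_style = parse_position("P -75.32 40.05")
new_style = parse_position('{"class": "TPV", "lon": -75.32, "lat": 40.05, "mode": 3}')
```

An application written against a function like this keeps working when
the wire format changes, and a keyed format lets the daemon add new
fields (here \code{mode}) without disturbing older clients.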

One possible reply to our emphasis on extensibility, not just in
GPSD's application protocol but in other aspects of the project
architecture like the packet-driver interface, is to dismiss it as an
over-elaboration brought about by mission creep.  Unix programmers
schooled in the tradition of ``do one thing well'' may ask whether
\code{gpsd}'s command set really needs to be larger in 2011 than it
was in 2006, why \code{gpsd} now handles non-GPS sensors like magnetic
compasses and Marine AIS receivers, and why we contemplate
possibilities like ADS-B aircraft tracking.

These are fair questions. We can approach an answer by looking at the
actual complexity cost of adding a new device type.  For very good
reasons, including relatively low data volumes and the high
electrical-noise levels historically associated with serial wires to
sensors, almost all reporting protocols for GPSs and other
navigation-related sensors look broadly similar: small packets with a
validation checksum of some sort.  Such protocols are fiddly to handle
but not really difficult to distinguish from each other and parse, and
the incremental cost of adding a new one tends to be less than a KLOC
each. Even the most complex of our supported protocols with their own
report generators attached, such as Marine AIS, only cost on the order
of 3 KLOC each. In aggregate, the drivers plus the packet-sniffer and
their associated JSON report generators are about 18 KLOC total.
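
As one example of what ``small packets with a validation checksum''
means in practice, here is a minimal sketch of the NMEA 0183 scheme, in
which the checksum is the XOR of every character between the leading
\code{\$} and the \code{*}, written as two hexadecimal digits.  The
function names are hypothetical, the payload is a made-up sentence
rather than real receiver output, and GPSD's own packet-sniffer is
considerably more involved than this.

```python
from functools import reduce


def nmea_checksum(payload):
    """XOR of all payload bytes (the text between '$' and '*')."""
    return reduce(lambda acc, byte: acc ^ byte, payload.encode("ascii"), 0)


def frame(payload):
    """Wrap a payload as '$payload*HH' with a two-digit hex checksum."""
    return "${}*{:02X}".format(payload, nmea_checksum(payload))


def is_valid(sentence):
    """Check the framing and checksum of one sentence."""
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, tail = sentence[1:].rpartition("*")
    try:
        return int(tail.strip(), 16) == nmea_checksum(payload)
    except ValueError:
        # Non-hex checksum field or non-ASCII payload: treat as noise.
        return False


# Made-up payload, framed and then corrupted by one character.
good = frame("GPGLL,4003.00,N,07519.20,W")
bad = good.replace("4003", "4004")
```

A single corrupted character changes the XOR, so \code{is\_valid(bad)}
is false while \code{is\_valid(good)} is true; this is the kind of
cheap test that makes such packets easy to tell apart from line noise.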

Comparing this with 43 KLOC for the project as a whole, we see that
most of the complexity cost of GPSD is actually in the framework code
around the drivers---and (importantly) in the test tools and framework
for verifying the daemon's correctness.  Duplicating these would be a
much larger project than writing any individual packet parser.  So
writing a GPSD-equivalent for a packet protocol that GPSD doesn't
handle would be a great deal more work than adding another driver and
test set to GPSD itself.  Conversely, the most economical outcome (and
the one with the lowest expected cumulative rate of defects) is for
GPSD to grow packet drivers for many different sensor types.

The ``one thing'' that GPSD has evolved to do well is handle any
collection of sensors that ship distinguishable checksummed packets.
What looks like mission creep is actually preventing many different
and duplicative handler daemons from having to be written.  Instead,
application developers get one relatively simple API and the benefit
of our hard-won expertise at design and testing across an increasing
range of sensor types.

What distinguishes GPSD from a mere mission-creepy pile of features is
not luck or black magic but careful application of known best
practices in software engineering.  The payoff from these begins with
a low defect rate in the present, and continues with the ability to
support new features with little effort or expected impact on defect
rates in the future.

Perhaps the most important lesson we have for other open-source
projects is this: reducing defect rates asymptotically close to zero
is difficult, but it's not impossible---not even for a project as
widely and variously deployed as GPSD is.  Sound architecture, good
coding practice, and a really determined focus on testing can achieve
it---and the most important prerequisite is the discipline to pursue
all three.

\end{aosasect1}

\end{aosachapter}
